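Many assumptions in the federated learning literature describe a best-case scenario that cannot be satisfied in most real-world applications. An asynchronous setting reflects the realistic environment in which federated learning methods must be able to operate reliably. Besides varying amounts of non-IID data at the participants, the asynchronous setting models heterogeneous client participation caused by limited computational power and battery constraints, and it also accounts for delayed communication between clients and the server. To reduce the communication overhead associated with asynchronous online federated learning (ASO-Fed), we use the principles of partial-sharing-based communication. In this manner, we reduce the communication load of the participants and therefore make participation in the learning task more accessible. We prove the convergence of the proposed ASO-Fed and provide simulations to further analyze its behavior. The simulations show that, in the asynchronous setting, it is possible to achieve the same convergence as federated stochastic gradient (Online-FedSGD) while reducing communication tenfold.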
translated by 谷歌翻译
This work focuses on unsupervised representation learning in person re-identification (ReID). Recent self-supervised contrastive learning methods learn invariance by maximizing the representation similarity between two augmented views of the same image. However, traditional data augmentation may introduce undesirable distortions of identity features, which is not always favorable in id-sensitive ReID tasks. In this paper, we propose to replace traditional data augmentation with a generative adversarial network (GAN) tailored to generate augmented views for contrastive learning. A 3D mesh guided person image generator is proposed to disentangle a person image into id-related and id-unrelated features. Unlike previous GAN-based ReID methods that only work in the id-unrelated space (pose and camera style), we conduct GAN-based augmentation on both id-unrelated and id-related features. We further propose specific contrastive losses to help our network learn invariance from id-unrelated and id-related augmentations. By jointly training the generative and the contrastive modules, our method achieves new state-of-the-art unsupervised person ReID performance on mainstream large-scale benchmarks.
Targeted syntactic evaluations of language models ask whether models show stable preferences for syntactically acceptable content over minimal-pair unacceptable inputs. Most targeted syntactic evaluation datasets ask models to make these judgements with just a single context-free sentence as input. This does not match language models' training regime, in which input sentences are always highly contextualized by the surrounding corpus. This mismatch raises an important question: how robust are models' syntactic judgements in different contexts? In this paper, we investigate the stability of language models' performance on targeted syntactic evaluations as we vary properties of the input context: the length of the context, the types of syntactic phenomena it contains, and whether or not there are violations of grammaticality. We find that model judgements are generally robust when placed in randomly sampled linguistic contexts. However, they are substantially unstable for contexts containing syntactic structures matching those in the critical test content. Among all tested models (GPT-2 and five variants of OPT), we significantly improve models' judgements by providing contexts with matching syntactic structures, and conversely significantly worsen them using unacceptable contexts with matching but violated syntactic structures. This effect is amplified by the length of the context, except for unrelated inputs. We show that these changes in model performance are not explainable by simple features matching the context and the test inputs, such as lexical overlap and dependency overlap. This sensitivity to highly specific syntactic features of the context can only be explained by the models' implicit in-context learning abilities.
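The minimal-pair protocol described above can be sketched in a few lines: a model "passes" a pair when it assigns a higher score to the acceptable sentence than to the unacceptable one. The function name `minimal_pair_accuracy` and the scores below are assumptions made for illustration only. The numbers are invented, not results from the paper, and in practice each score would be a sentence log-probability from a language model, optionally conditioned on a prepended context.

```python
def minimal_pair_accuracy(pairs):
    """Fraction of pairs where the acceptable sentence gets the higher score.

    `pairs` is a list of (score_acceptable, score_unacceptable) tuples,
    e.g. sentence log-probabilities from a language model (assumed input).
    """
    wins = sum(1 for good, bad in pairs if good > bad)
    return wins / len(pairs)


# Invented log-probabilities, purely illustrative (not results from the paper).
toy_pairs = [(-12.1, -14.3), (-20.5, -19.8), (-8.0, -9.5), (-15.2, -15.9)]
print(minimal_pair_accuracy(toy_pairs))  # 3 of 4 pairs favor the acceptable sentence
```

Studying context effects then amounts to recomputing the same accuracy after prepending different contexts to both sentences of each pair.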
Semantic segmentation of 3D data for autonomous driving using deep learning has become a well-studied subject, with methods that reach very high performance. Nonetheless, because of the limited size of the training datasets, these models cannot see every type of object and scene found in real-world applications. The ability to remain reliable in these various unknown environments is called domain generalization. Despite its importance, domain generalization is relatively unexplored for 3D autonomous driving semantic segmentation. To fill this gap, this paper presents the first benchmark for this application by testing state-of-the-art methods and discussing the difficulty of tackling LiDAR domain shifts. We also propose the first method designed to address this domain generalization, which we call 3DLabelProp. This method leverages the geometry and sequentiality of the LiDAR data to enhance its generalization performance by working on partially accumulated point clouds. It reaches a mIoU of 52.6% on SemanticPOSS while being trained only on SemanticKITTI, making it the state-of-the-art method for generalization (+7.4% better than the second-best method). The code for this method will be available on GitHub.
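For readers unfamiliar with the mIoU metric quoted above, here is a minimal sketch of how mean intersection-over-union can be computed from per-point labels. The function name `mean_iou` and the toy labels are assumptions for illustration. Real benchmarks typically accumulate a confusion matrix over the whole dataset and follow dataset-specific rules for ignored classes, which this sketch does not reproduce.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    Classes that appear in neither `pred` nor `target` are skipped
    (an assumption; benchmarks define their own handling).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy per-point labels for three classes, purely illustrative.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.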
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach views each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and seeks to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies given the parameters of the MDP. We show that, under certain conditions, learning this mapping can be treated as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from the DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
The extragradient method has recently gained increasing attention, due to its convergence behavior on smooth games. In $n$-player differentiable games, the eigenvalues of the Jacobian of the vector field are distributed on the complex plane, exhibiting more convoluted dynamics compared to classical (i.e., single player) minimization. In this work, we carry out a polynomial-based analysis of the extragradient with momentum for optimizing games with \emph{cross-shaped} Jacobian spectrum on the complex plane. We show two results. First, based on the hyperparameter setup, the extragradient with momentum exhibits three different modes of convergence: when the eigenvalues are distributed $i)$ on the real line, $ii)$ on the real line together with complex conjugates, and $iii)$ only as complex conjugates. Then, we focus on the case $ii)$, i.e., when the eigenvalues of the Jacobian have \emph{cross-shaped} structure, as observed in training generative adversarial networks. For this problem class, we derive the optimal hyperparameters of the momentum extragradient method, and show that it achieves an accelerated convergence rate.
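To make the extragradient update concrete, the following is a minimal sketch on the toy bilinear game $\min_x \max_y xy$, whose vector field has a Jacobian with purely imaginary eigenvalues $\pm i$. Everything here is an assumption for illustration: the game, the function names, the heavy-ball form of the momentum term, and the step size are not taken from the paper, and the hyperparameters are not the optimal ones it derives.

```python
import math


def field(z):
    """Vector field (df/dx, -df/dy) of the toy game f(x, y) = x * y."""
    x, y = z
    return (y, -x)


def gda(z0, step, iters):
    """Plain simultaneous gradient descent-ascent, shown for contrast."""
    z = z0
    for _ in range(iters):
        v = field(z)
        z = (z[0] - step * v[0], z[1] - step * v[1])
    return z


def extragradient(z0, step, momentum, iters):
    """Extragradient with an assumed heavy-ball momentum term."""
    prev, z = z0, z0
    for _ in range(iters):
        v = field(z)
        # Extrapolation: look ahead with one gradient step.
        half = (z[0] - step * v[0], z[1] - step * v[1])
        vh = field(half)
        # Update from z using the field at the look-ahead point, plus momentum.
        new = (
            z[0] - step * vh[0] + momentum * (z[0] - prev[0]),
            z[1] - step * vh[1] + momentum * (z[1] - prev[1]),
        )
        prev, z = z, new
    return z


def norm(z):
    return math.hypot(z[0], z[1])


start = (1.0, 1.0)
print("GDA distance to equilibrium:", norm(gda(start, 0.5, 100)))
print("EG  distance to equilibrium:", norm(extragradient(start, 0.5, 0.0, 100)))
```

With zero momentum and a step size of 0.5, each extragradient step multiplies the distance to the equilibrium $(0, 0)$ by $\sqrt{1 - \eta^2 + \eta^4} < 1$, whereas gradient descent-ascent multiplies it by $\sqrt{1 + \eta^2} > 1$ and spirals outward.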
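Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates it through the computational path. This work provides a non-asymptotic convergence-rate analysis of this approach on quadratic objectives for gradient descent and the Chebyshev method. We show that, to ensure convergence of the Jacobian, we can either 1) choose a large learning rate, leading to fast asymptotic convergence, but accept that the algorithm may have an arbitrarily long burn-in phase, or 2) choose a smaller learning rate, leading to immediate but slower convergence. We refer to this phenomenon as the curse of unrolling. Finally, we discuss open problems related to this approach, such as deriving a practical update rule for the optimal unrolling strategy, and we establish new connections with the field of Sobolev orthogonal polynomials.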
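As a rough illustration of unrolled differentiation, the sketch below differentiates gradient descent through its own iterations on a one-dimensional quadratic. The objective $f(x, \theta) = \tfrac{1}{2}\theta x^2 - bx$, the function name `unrolled_gd`, and the numeric values are assumptions chosen so that the exact answer is known: the minimizer is $x^\star = b/\theta$, so the true Jacobian is $-b/\theta^2$. This is a scalar toy, not the paper's analysis.

```python
def unrolled_gd(theta, b, step, iters):
    """Run gradient descent on f(x) = 0.5 * theta * x**2 - b * x from x = 0
    and propagate d x_t / d theta alongside the iterates (forward mode)."""
    x, jac = 0.0, 0.0
    for _ in range(iters):
        grad = theta * x - b
        # Differentiate the update x - step * (theta * x - b) w.r.t. theta,
        # using the current x and the current derivative jac.
        jac = jac - step * (x + theta * jac)
        x = x - step * grad
    return x, jac


theta, b = 2.0, 1.0
x_hat, jac_hat = unrolled_gd(theta, b, step=0.4, iters=50)
print("solution estimate:", x_hat, "exact:", b / theta)
print("Jacobian estimate:", jac_hat, "exact:", -b / theta**2)
```

In this toy the Jacobian error contains a term proportional to $t\,r^{t-1}$ with $r = 1 - \eta\theta$, on top of the usual $r^t$ decay, which gives some intuition for why the Jacobian estimate can lag behind the iterates themselves.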
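Voice activity detection (VAD) aims to detect speech segments in an audio signal, which is a necessary first step for many of today's speech-based applications. Current state-of-the-art methods focus on training a neural network that exploits features taken directly from the acoustics, such as Mel Filter Banks (MFBs). Such methods therefore require an extra normalization step to adapt to a new domain in which the acoustics are affected, which can be due simply to a change of speaker, microphone, or environment. In addition, this normalization step is usually a rather rudimentary method with certain limitations, such as being highly sensitive to the amount of data available for the new domain. Here, we use the crowd-sourced Common Voice (CV) corpus to show that representations based on self-supervised learning (SSL) adapt well to different domains, because they are computed from contextualized representations of speech across multiple domains. SSL representations also achieve better results than systems based on hand-crafted representations (MFBs) and off-the-shelf VADs, with significant improvement in cross-domain settings.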
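Most automatic emotion recognition systems exploit time-continuous annotations of emotion to provide fine-grained descriptions of spontaneous expressions as observed in real-life interactions. Because emotion is rather subjective, its annotation is usually performed by several annotators who each provide a trace for a given dimension, i.e., a time-continuous series describing a dimension such as arousal or valence. However, annotations of the same expression are rarely consistent between annotators, either in time or in value, which adds bias and delay to the traces used to learn predictive models of emotion. We therefore propose a method that dynamically compensates for inconsistencies across annotations and synchronizes the traces with the corresponding acoustic features using recurrent neural networks. Experimental evaluations were carried out on several emotion datasets that include Chinese, French, German, and Hungarian participants who interacted remotely, either in noise-free conditions or in the wild. The results show that, for both arousal and valence, our method significantly increases inter-annotator agreement as well as the correlation between traces and audio features. In addition, improvements are obtained in the automatic prediction of these dimensions using simple lightweight models, especially for valence in noise-free conditions and for arousal in recordings captured in the wild.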
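Bias in datasets can be very detrimental to appropriate statistical estimation. In response to this problem, importance weighting methods have been developed to match any biased distribution to its corresponding unbiased target distribution. The seminal Kernel Mean Matching (KMM) method is still considered state of the art in this research field. However, one of its main drawbacks is the computational burden for large datasets. Building on previous work by Huang et al. (2007) and de Mathelin et al. (2021), we derive a novel importance weighting algorithm that scales to large datasets by using a neural network to predict the instance weights. We show on multiple public datasets, under various sample biases, that our proposed approach drastically reduces the computational time on large datasets while maintaining sample bias correction performance similar to that of other importance weighting methods. The proposed approach appears to be the only one able to give relevant reweighting in a reasonable time for large datasets with up to two million data points.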